Dynamic construction and management of task pipelines

ABSTRACT

A system and method are disclosed for managing the execution of tasks. Each task in a first set of tasks included in a pipeline is queued for parallel execution. The execution of the tasks is monitored by a dispatching engine. When a particular task that specifies a next set of tasks in the pipeline to be executed has completed, the dispatching engine determines whether the next set of tasks can be executed before the remaining tasks in the first set of tasks have completed. When the next set of tasks can be executed before the remaining tasks have completed, the next set of tasks is queued for parallel execution. When the next set of tasks cannot be executed before the remaining tasks have completed, the next set of tasks is queued for parallel execution only after the remaining tasks have completed.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority under 35 U.S.C. §119(e) to U.S. Provisional Patent Application Ser. No. 61/789,504, filed Mar. 15, 2013 and entitled “Dynamic Construction and Management of Task Pipelines.” The content of the U.S. Provisional Patent Application Ser. No. 61/789,504 is incorporated herein in its entirety.

BACKGROUND

1. Field of Art

The disclosure generally relates to task execution and specifically to the dynamic construction and management of task pipelines.

2. Description of the Related Art

A server environment performs two types of services, front-end (user facing) services and back-end (non-user facing) services. Back-end services include asynchronous and/or short-lived processing jobs, such as collecting and indexing data from remote systems or processing requests by the user that take longer than a few seconds. Typically, an execution infrastructure within the server environment manages the execution of tasks associated with these back-end services.

The execution infrastructures that exist today have several limitations. These infrastructures do not orchestrate tasks that require the coordination of multiple different types of tasks, do not seamlessly handle retries for failed tasks and do not provide the means to resolve race conditions between different tasks. Further, in some cases, the back-up functionality provided by the execution infrastructure is not robust to system failures.

Accordingly, there is a need for a system that enables the execution of multiple tasks of a pipeline in a robust manner.

BRIEF DESCRIPTION OF DRAWINGS

The disclosed embodiments have advantages and features which will be more readily apparent from the detailed description, the appended claims, and the accompanying figures (or drawings). A brief introduction of the figures is below.

Figure (FIG.) 1 illustrates one embodiment of a computing environment configured to aggregate data from several sources and provide the aggregated data to client applications.

FIG. 2 illustrates an embodiment of the data processing engine of FIG. 1.

FIG. 3 illustrates an embodiment of a pipeline executed by the data processing engine of FIG. 1.

FIGS. 4A and 4B illustrate an embodiment of a process for managing the execution of a pipeline of tasks.

FIG. 5 illustrates one embodiment of components of an example machine able to read instructions from a machine-readable medium and execute them in a processor (or controller).

DETAILED DESCRIPTION

The Figures (FIGS.) and the following description relate to preferred embodiments by way of illustration only. It should be noted that from the following discussion, alternative embodiments of the structures and methods disclosed herein will be readily recognized as viable alternatives that may be employed without departing from the principles of what is claimed.

Reference will now be made in detail to several embodiments, examples of which are illustrated in the accompanying figures. It is noted that wherever practicable similar or like reference numbers may be used in the figures and may indicate similar or like functionality. The figures depict embodiments of the disclosed system (or method) for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles described herein.

Configuration Overview

One embodiment of a disclosed configuration is a system and method for managing the execution of tasks. Each task in a first set of tasks included in a pipeline is queued for parallel execution. The execution of the tasks is monitored by a dispatching engine. When a particular task that specifies a next set of tasks in the pipeline to be executed has completed, the dispatching engine determines whether the next set of tasks can be executed before the remaining tasks in the first set of tasks have completed. When the next set of tasks can be executed before the remaining tasks have completed, the next set of tasks is queued for parallel execution. When the next set of tasks cannot be executed before the remaining tasks have completed, the next set of tasks is queued for parallel execution (parallel within the set of tasks) only after the remaining tasks have completed.

Example Processing Overview

FIG. 1 illustrates one embodiment of a computing environment 100 configured to aggregate data from several sources and provide the aggregated data to client applications. As shown, the computing environment 100 includes data sources 102, a data aggregation server 106 and a client device 108. The data sources 102, the aggregation server 106 and the client device 108 are communicatively coupled through a network 104. Although only one aggregation server 106 and client device 108 are illustrated, the computing environment 100 may include multiple instances of each entity. Moreover, some of the functions ascribed to the aggregation server 106 may be performed by the client device 108 and vice versa. Other entities may also be coupled to the network 104.

One or more data sources 102(0) . . . 102(N) (generally 102) are a part of a system that manages and stores data associated with individuals or groups of individuals. For example, a data source 102 may be a contact management system, a customer relationship management (CRM) system or a human resource (HR) management system. Each data source 102 stores data according to a fixed database schema. For example, data source 102(0) may store a user's contact data according to a schema that stores a record per contact, each record being associated with one or more fixed fields. In one embodiment, data storage schemas across different data sources may vary significantly even when storing the same type of data. Each data source 102 provides a channel for accessing and updating data stored within the data source 102.

The data aggregation server 106 includes a data processing engine 110 and a server repository 112. The data processing engine 110 accesses data stored within the data sources 102 via the channels provided by each data source 102. The data processing engine 110 aggregates related data received from the different data sources 102 and organizes the aggregated data into flexible records. A flexible record is a composite of fields aggregated from a set of related records received from one or more data sources 102. Each field associated with a flexible record includes data received from a particular data source 102 and specifies the particular data source 102 as the source of the data. Flexible records are stored in the server repository 112. Each flexible record stored in the server repository 112 is associated with at least one user who accesses data via a client device, such as client device 108, communicating with the data aggregation server 106.
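
By way of a non-limiting illustration, a flexible record may be sketched in Python as a list of fields that each carry the identifier of the data source that supplied them. The names SourcedField, FlexibleRecord and aggregate, and the dictionary shape of the incoming records, are assumptions made for this sketch and are not part of the disclosure.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourcedField:
    """One field of a flexible record, tagged with its originating source."""
    name: str
    value: str
    source_id: str  # hypothetical identifier of a data source 102


@dataclass
class FlexibleRecord:
    """Composite of fields gathered from related records in several sources."""
    fields: list[SourcedField] = field(default_factory=list)
    user_ids: set[str] = field(default_factory=set)


def aggregate(related_records: dict[str, dict[str, str]]) -> FlexibleRecord:
    """Merge related records, keyed by source identifier, into one record."""
    record = FlexibleRecord()
    for source_id, raw in related_records.items():
        for name, value in raw.items():
            record.fields.append(SourcedField(name, value, source_id))
    return record


record = aggregate({
    "crm": {"email": "pat@example.com"},
    "hr": {"title": "Engineer"},
})
record.user_ids.update({"user-1", "user-2"})  # several users may share a record
```

Because every field names its source, a later modification to a field can be routed back to the data source from which the field was retrieved.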

In operation, when a user creates an account with the data aggregation server 106, the user identifies one or more data sources 102 that store data associated with the user. In one embodiment, the aggregation server 106 automatically, without user intervention, identifies the data sources 102 that store data associated with the user based on the user's location, name, organization affiliation, etc. The data processing engine 110 retrieves from each identified data source one or more records storing data associated with the user. The records retrieved from different data sources may store related data but may be structured according to different schemas. The data processing engine 110 aggregates the records and stores the aggregated records as flexible records in the server repository 112. In one embodiment, multiple users may be associated with the same data in one or more data sources 102. In such an embodiment, the data processing engine 110 does not generate multiple flexible records storing the same data but associates the multiple users with the same flexible record storing the data.

Data stored in the server repository 112 that is associated with a particular user is transmitted to the client device 108 operated by the user for presentation in the data presentation application 114. Data received from the server repository 112 is stored in the client repository 116. The data presentation application 114 retrieves data stored in the client repository 116 and allows users to view and interact with the data as well as modify the data if necessary. Any modifications made to the data are stored in the client repository 116 and also transmitted by the data presentation application 114 to the data processing engine 110.

The data processing engine 110 tracks all modifications made via the data presentation application 114 to data that is also stored in the server repository 112. In one embodiment, the data processing engine 110 identifies a particular data field stored in the server repository 112 that was modified via the data presentation application 114. The data processing engine 110 transmits the modified data to the data source 102 specified in the data field. In such a manner, a data field that is modified on the client device 108 may be synchronized with the data field stored in the server repository 112 as well as the data source 102 from which the data associated with the data field was originally retrieved.

The network 104 represents the communication pathways between the data sources 102, the data aggregation server 106, the client device 108, and any other entities on the network. In one embodiment, the network 104 is the Internet and uses standard communications technologies and/or protocols. Thus, the network 104 can include links using technologies such as Ethernet, 802.11, worldwide interoperability for microwave access (WiMAX), 3G, long term evolution (LTE), digital subscriber line (DSL), asynchronous transfer mode (ATM), InfiniBand, PCI Express Advanced Switching, etc. Similarly, the networking protocols used on the network 104 can include multiprotocol label switching (MPLS), the transmission control protocol/Internet protocol (TCP/IP), the User Datagram Protocol (UDP), the hypertext transfer protocol (HTTP), the simple mail transfer protocol (SMTP), the file transfer protocol (FTP), etc. The data exchanged over the network 104 can be represented using technologies and/or formats including the hypertext markup language (HTML), the extensible markup language (XML), etc. In addition, all or some of the links can be encrypted using conventional encryption technologies such as secure sockets layer (SSL), transport layer security (TLS), virtual private networks (VPNs), Internet Protocol security (IPsec), etc. In other embodiments, the entities use custom and/or dedicated data communications technologies instead of, or in addition to, the ones described above.

FIG. 2 illustrates an embodiment of the data processing engine 110 of FIG. 1. As shown, the data processing engine 110 includes a task processing engine 202 and a task executor 220. The task processing engine 202 orchestrates the execution of various tasks performed by the data processing engine 110. Such tasks include, but are not limited to, data aggregation tasks related to data retrieved from the data sources 102 or data access tasks for accessing data in the server repository 112. The task executor 220 executes commands specified by different tasks within one or more processors.

The task processing engine 202 includes a task execution module 204, a task list 214, an execution log 216 and a dead funnel set 218. The task execution module 204 queues and monitors the execution of tasks performed by the data processing engine 110. In one embodiment, the task execution module 204 executes as a hypertext transfer protocol (HTTP) server that receives tasks for queuing via HTTP post commands. Modules that send tasks to the task execution module 204 may also retrieve the execution status of particular tasks via HTTP get commands. The task execution module 204 receives tasks organized as funnels. A funnel includes one or more tasks, and multiple tasks in a given funnel can be executed in parallel. Further, a funnel may include a nested funnel such that tasks included in the funnel are further divided into multiple stages. Two or more funnels may also be sequentially linked to create a pipeline of funnels. For tasks in a pipeline, tasks in a subsequent funnel can be executed only once the tasks in the previous funnel are complete. The tasks in a pipeline do not have to be defined or specified at the time the task execution module 204 receives the pipeline or a portion thereof. The tasks in the pipeline may be dynamically added as other tasks in the pipeline are executed.
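
By way of a non-limiting illustration, the relationship between tasks, funnels, nested funnels and pipelines may be sketched in Python as follows. The class names Task, Funnel and Pipeline and the method names tasks and append_funnel are assumptions made for this sketch; the disclosure does not prescribe a particular data layout.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Union


@dataclass
class Task:
    name: str


@dataclass
class Funnel:
    """A group of members; a member is a task or a nested funnel."""
    members: list[Union[Task, "Funnel"]] = field(default_factory=list)

    def tasks(self) -> list[Task]:
        """Flatten this funnel, including nested funnels, into its tasks."""
        found: list[Task] = []
        for member in self.members:
            if isinstance(member, Funnel):
                found.extend(member.tasks())
            else:
                found.append(member)
        return found


@dataclass
class Pipeline:
    """Sequentially linked funnels; a later funnel waits on the earlier one."""
    funnels: list[Funnel] = field(default_factory=list)

    def append_funnel(self, funnel: Funnel) -> None:
        # Funnels may be linked on after the pipeline was first received,
        # which is what allows the pipeline to grow at run-time.
        self.funnels.append(funnel)


pipeline = Pipeline([Funnel([Task("fetch"), Funnel([Task("parse"), Task("merge")])])])
pipeline.append_funnel(Funnel([Task("index")]))
```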

The task execution module 204 includes a queuing module 206, a dispatching module 208 and an error handling module 210. The queuing module 206 receives funnels of tasks for execution from various modules in the data processing engine 110. The queuing module 206 queues each task of the received funnel in the task list 214. Each task specifies a uniquely identifiable key and a command. In one embodiment, a key is a combination of a unique job identifier and the type of task, e.g., an archive type or an index job type. The command identifies the particular code to be executed to complete the task as well as any inputs needed to execute the code. In one embodiment, each task and/or each funnel of tasks is associated with a given user or a processing track. The funnel may be associated with an identifier that maps to a single user. In alternate embodiments, the identifier associated with the funnel may be more coarse or fine grained than being mapped to a single user. For example, the identifier may be a source identifier identifying a particular source associated with the user.
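
By way of a non-limiting illustration, a task carrying a key and a command, and a task list into which a funnel is queued, may be sketched in Python as follows. The names TaskKey, Task, TaskList, queue_funnel and owner_id, and the representation of a command as a string plus a dictionary of inputs, are assumptions made for this sketch.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskKey:
    """Key formed from a job identifier and the type of task."""
    job_id: str
    task_type: str  # e.g. "archive" or "index"


@dataclass
class Task:
    key: TaskKey
    command: str                                 # names the code to execute
    inputs: dict = field(default_factory=dict)   # inputs needed by that code
    owner_id: str = ""                           # user, track or source identifier


class TaskList:
    """Stand-in for the task list 214: tasks are held in queue order."""

    def __init__(self):
        self._queue = deque()

    def queue_funnel(self, tasks, owner_id):
        """Queue every task of a received funnel under one identifier."""
        for task in tasks:
            task.owner_id = owner_id
            self._queue.append(task)

    def tasks(self):
        return list(self._queue)

    def __len__(self):
        return len(self._queue)


task_list = TaskList()
task_list.queue_funnel(
    [
        Task(TaskKey("job-1", "archive"), "archive_records"),
        Task(TaskKey("job-1", "index"), "index_records", {"shard": 0}),
    ],
    owner_id="user-42",
)
```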

The dispatching module 208 dispatches tasks queued in the task list 214 for execution in the task executor 220. Tasks belonging to the same funnel can be dispatched for execution in parallel. A given task may be dispatched for execution only when all the tasks in a funnel associated with a previous stage of the pipeline have completed. To accomplish this ordering of task execution, the dispatching module 208 tracks, for each task in the task list 214, other tasks in a pipeline that must complete execution before that task can be dispatched for execution. When a particular task completes execution, the dispatching module 208 receives an indication of the completed execution. The indication may optionally specify a next task to be executed once the task completes execution. The queuing module 206 queues the next task for execution. Because a task in a pipeline may specify a next task at completion, the pipeline does not have a pre-determined structure and tasks can be dynamically included in the pipeline during execution, i.e., at run-time.
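
By way of a non-limiting illustration, the dependency tracking and run-time growth described above may be sketched in Python as follows. The class name Dispatcher and its methods queue and on_complete are assumptions made for this sketch, and tasks are reduced to string names. In this simplification a next task named at completion is queued without prerequisites of its own.

```python
class Dispatcher:
    """Tracks, per queued task, the tasks that must complete before it."""

    def __init__(self):
        self.waiting = {}      # task name -> names it still waits on
        self.completed = set()
        self.dispatched = []   # order in which tasks were handed to the executor

    def queue(self, name, waits_on=()):
        self.waiting[name] = set(waits_on) - self.completed
        self._dispatch_ready()

    def on_complete(self, name, next_task=None):
        """Handle a completion indication, which may name a next task."""
        self.completed.add(name)
        for deps in self.waiting.values():
            deps.discard(name)
        if next_task is not None:
            self.waiting[next_task] = set()  # added to the pipeline at run-time
        self._dispatch_ready()

    def _dispatch_ready(self):
        ready = [name for name, deps in self.waiting.items() if not deps]
        for name in ready:
            del self.waiting[name]
            self.dispatched.append(name)


dispatcher = Dispatcher()
dispatcher.queue("fetch")
dispatcher.queue("parse", waits_on=["fetch"])
dispatcher.on_complete("fetch", next_task="index")
```

In this usage, "parse" is held back until "fetch" completes, and "index" joins the pipeline only because the completion indication for "fetch" named it.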

In one embodiment, a funnel in a pipeline may be marked as a “serial” funnel. When a task in a serial funnel completes execution, the dispatching module 208 analyzes the task list 214 to identify any tasks already queued that match the uniquely identifiable key of the completed task. If such tasks exist, the dispatching module 208 dispatches those tasks for execution before tasks included in subsequent funnels of the pipeline.
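
By way of a non-limiting illustration, the prioritization applied on completion of a task in a serial funnel may be sketched in Python as follows. The names Queued and order_after_completion, and the representation of a key as a single string, are assumptions made for this sketch.

```python
from typing import NamedTuple


class Queued(NamedTuple):
    key: str           # hypothetical "job:type" form of the task key
    funnel_index: int  # position of the task's funnel in the pipeline


def order_after_completion(completed, task_list, serial):
    """Return the queued tasks in the order they would be dispatched.

    For a serial funnel, queued tasks whose key matches the completed task
    are moved ahead of tasks that belong to subsequent funnels.
    """
    if not serial:
        return list(task_list)
    matching = [t for t in task_list if t.key == completed.key]
    others = [t for t in task_list if t.key != completed.key]
    return matching + others


done = Queued("job-1:index", 0)
queue = [Queued("job-2:archive", 1), Queued("job-1:index", 2)]
ordered = order_after_completion(done, queue, serial=True)
```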

The task processing engine 202 also maintains the execution log 216. The execution log 216 periodically captures state information related to the execution of currently dispatched tasks. The information captured by the execution log 216 includes the identity of tasks that have timed out, tasks that have experienced an error during execution and tasks that have successfully completed, etc. The error handling module 210 analyzes the execution log 216 at regular intervals to determine whether a task needs to be re-queued for execution, for example, when the task has timed out or has experienced an error. In one embodiment, the error handling module 210 re-queues a given task only a pre-determined number of times before retiring the entire funnel including the fatal task to the dead funnel set 218.
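
By way of a non-limiting illustration, one pass of the error handling module over the execution log may be sketched in Python as follows. The class name ErrorHandler, the state strings "timed_out" and "error", the tuple layout of a log entry and the default limit of three retries are assumptions made for this sketch.

```python
class ErrorHandler:
    """Re-queues failed tasks a bounded number of times, then retires the funnel."""

    def __init__(self, max_retries=3):  # the limit is a hypothetical default
        self.max_retries = max_retries
        self.retries = {}          # task key -> times re-queued so far
        self.requeued = []         # task keys handed back for queuing
        self.dead_funnels = set()  # stand-in for the dead funnel set 218

    def scan(self, execution_log):
        """Examine log entries of the form (task_key, funnel_id, state)."""
        for task_key, funnel_id, state in execution_log:
            if state not in ("timed_out", "error"):
                continue
            if self.retries.get(task_key, 0) >= self.max_retries:
                self.dead_funnels.add(funnel_id)  # retire the whole funnel
            else:
                self.retries[task_key] = self.retries.get(task_key, 0) + 1
                self.requeued.append(task_key)


handler = ErrorHandler(max_retries=2)
for _ in range(3):
    handler.scan([("job-1:index", "funnel-A", "error"),
                  ("job-1:archive", "funnel-A", "completed")])
```

With a limit of two, the failing task is re-queued on the first two passes and its funnel is retired on the third.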

In one embodiment, the task processing engine 202 periodically saves processing states of each of the currently executing tasks in the server repository 112. Consequently, if the task processing engine 202 or the data aggregation server 106, as a whole, suffers a failure, the task processing engine 202 is able to resurrect processing state from the server repository 112 and continue to execute dispatched tasks.
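
By way of a non-limiting illustration, saving and resurrecting processing state may be sketched in Python as follows. A JSON file on local disk is used here purely as a stand-in for the server repository 112, and the function names save_state and restore_state are assumptions made for this sketch.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write the state to a temporary file, then atomically replace the old copy.

    Replacing in one step means a crash during the write leaves the
    previously saved state intact.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def restore_state(path):
    """Return the last saved state, or an empty state if none was saved."""
    if not os.path.exists(path):
        return {}
    with open(path) as handle:
        return json.load(handle)


state_path = os.path.join(tempfile.mkdtemp(), "processing_state.json")
save_state(state_path, {"job-1:index": "running", "job-1:archive": "completed"})
recovered = restore_state(state_path)  # what a restarted engine would read back
```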

FIG. 3 illustrates an embodiment of a pipeline 300 executed by the data processing engine 110 of FIG. 1. As shown, the pipeline 300 includes three funnels, funnel 314, funnel 316 and funnel 318. Funnel 314 includes task 302 and a nested funnel 320 that includes tasks 304, 306 and 308. Funnel 316 includes task 310 and funnel 318 includes task 312.

When tasks included in funnels 314, 316 and 318 are queued in the task list 214, the dispatching module 208 dispatches the tasks in funnel 314 before dispatching tasks in funnels 316 and 318. Within funnel 314, the dispatching module 208 dispatches tasks 302 and 304 in parallel, but dispatches task 306 only when task 304 completes and task 308 only when task 306 completes. When task 308 and task 302 complete, the dispatching module 208 dispatches task 310. Finally, when task 310 completes, the dispatching module 208 dispatches task 312.
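
By way of a non-limiting illustration, the dispatch order described for pipeline 300 may be reproduced with the following Python sketch. The function name dispatch_waves and the dictionary of prerequisites are assumptions made for this sketch. The sketch also assumes that all tasks dispatched together complete before the next group is considered, and that task 310 waits on every task of funnel 314.

```python
def dispatch_waves(waits_on):
    """Group tasks into successive waves that may be dispatched in parallel.

    waits_on maps each task to the set of tasks that must complete first.
    """
    pending = dict(waits_on)
    done = set()
    waves = []
    while pending:
        ready = sorted(t for t, deps in pending.items() if deps <= done)
        if not ready:
            raise ValueError("remaining tasks wait on each other")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del pending[task]
    return waves


# Prerequisites as read from the description of FIG. 3.
PIPELINE_300 = {
    302: set(),
    304: set(),
    306: {304},
    308: {306},
    310: {302, 304, 306, 308},  # funnel 316 follows all of funnel 314
    312: {310},                 # funnel 318 follows funnel 316
}

waves = dispatch_waves(PIPELINE_300)
```

Under these assumptions the waves are tasks 302 and 304 together, then task 306, task 308, task 310 and task 312 in turn.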

FIGS. 4A and 4B illustrate an embodiment of a process for managing the execution of a pipeline of tasks. At step 402, the queuing module 206 queues in the task list 214 each task in a funnel included in a pipeline. At step 404, the dispatching module 208 dispatches, from the task list 214 to the task executor 220, all the tasks included in the funnel that are not gated by the execution of previous tasks. At step 406, the error handling module 210 analyzes the execution log 216 to determine whether any of the dispatched tasks are in an error state.

If none of the dispatched tasks is in an error state, then the method proceeds from step 406 to step 408. At step 408, the dispatching module 208 determines whether any of the dispatched tasks have completed. Specifically, when a particular task completes execution, the dispatching module 208 receives an indication of the completed execution. At step 410, the dispatching module 208 determines whether execution of the pipeline can advance to a subsequent task or a subsequent funnel in the pipeline or whether to wait on the remaining dispatched tasks.

If the dispatching module 208 determines that the execution can advance, the method proceeds from step 410 to step 412. At step 412, the dispatching module 208 analyzes the indication received when the task completed execution to determine whether a next sub-task to be executed was specified. If not, the method returns to step 406. If a next sub-task is specified, however, the method proceeds to step 414, where the dispatching module 208 dispatches the next sub-task to the task executor 220 for execution. The method then returns to step 406.

If, at step 410, the dispatching module 208 determines that the execution cannot advance, the method proceeds from step 410 to step 416. At step 416, the dispatching module 208 waits for the remaining tasks to complete before advancing the execution of the pipeline. At step 418, the dispatching module 208 analyzes the indications received when the tasks complete execution to determine whether a next funnel of tasks to be executed was specified by each of the completed tasks. If so, the dispatching module 208 dispatches the tasks in the next funnel to the task executor 220 for execution. The method then returns to step 406. If, however, a next funnel of tasks was not specified, the method ends.

Referring back to step 406, if at least one of the dispatched tasks is in an error state, the method proceeds from step 406 to step 422. At step 422, the error handling module 210 determines whether the number of times the task in the error state has been retried equals a pre-determined threshold. If so, at step 426 the error handling module 210 retires the funnel to the dead funnel set 218. If, however, the task has not been retried a pre-determined number of times, at step 424 the error handling module 210 re-queues the task for future dispatch and execution.
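
By way of a non-limiting illustration, the overall flow of FIGS. 4A and 4B may be condensed into the following Python sketch. The function name run_pipeline and the callback run_task, which returns a success flag together with an optional list naming a next funnel of tasks, are assumptions made for this sketch. Tasks of a funnel are run one after another here as a stand-in for parallel dispatch, and a failed task is retried immediately rather than re-queued for later dispatch.

```python
def run_pipeline(first_funnel, run_task, max_retries=3):
    """Run funnels until none is specified; return (executed, retired_funnel).

    retired_funnel is None when the pipeline ends normally, or the funnel
    that was retired after a task exhausted its retries.
    """
    executed = []
    funnel = list(first_funnel)               # step 402: queue the funnel
    while funnel:
        next_funnel = []
        for task in funnel:                   # step 404: dispatch its tasks
            retries = 0
            while True:
                ok, specified = run_task(task)
                if ok:
                    break
                if retries == max_retries:    # step 422: threshold reached
                    return executed, funnel   # step 426: retire the funnel
                retries += 1                  # step 424: try the task again
            executed.append(task)
            if specified:                     # step 418: next funnel named
                next_funnel.extend(specified)
        funnel = next_funnel                  # advance once the funnel is done
    return executed, None


attempts = []


def fake_run(task):
    """Hypothetical executor: 'flaky' fails once, 'fetch' names a next funnel."""
    attempts.append(task)
    if task == "flaky" and attempts.count("flaky") < 2:
        return False, None
    return True, {"fetch": ["index"]}.get(task)


executed, retired = run_pipeline(["fetch", "flaky"], fake_run)
```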

Computing Machine Architecture

The disclosed software structures and processes described with FIGS. 1-4B are configured for operation on a machine, e.g., a computing system. FIG. 5 is a block diagram illustrating components of an example machine able to read instructions from a machine-readable medium and execute them in one or more processors (or controllers). Specifically, FIG. 5 shows a diagrammatic representation of a machine in the example form of a computer system 500 within which instructions 524 (e.g., software) for causing the machine to perform any one or more of the methodologies discussed herein may be executed. In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment.

The machine for this configuration may be a computing server or a computing server architecture. In addition, devices such as a mobile computing device may apply, for example, a tablet computer, an ultrabook (or netbook) computer, a personal digital assistant (PDA), a cellular telephone, a smartphone, a web appliance, or like machine capable of executing instructions 524 (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute instructions 524 to perform any one or more of the methodologies discussed herein.

The example computer system 500 includes one or more processors (generally, processor 502) (e.g., a central processing unit (CPU)) and may also include a graphics processing unit (GPU), a digital signal processor (DSP), one or more application specific integrated circuits (ASICs), one or more radio-frequency integrated circuits (or chipset) (RFICs), a wireless fidelity (WiFi) chipset, a global positioning system (GPS) chipset, an accelerometer (one, two, or three-dimensional), or any combination of these. The computer system 500 also includes one or more memories such as a main memory 504 and a static memory 506. The components of the computing system are configured to communicate with each other via a bus 508. The computer system 500 may further include a graphics display unit 510 (e.g., a plasma display panel (PDP), a liquid crystal display (LCD)) which may be configured for capacitive or inductive touch sensitivity to allow for direct interaction with software user interfaces through the display 510. The computer system 500 may also include an alphanumeric input device 512 (e.g., a keyboard), a cursor control device 514 (e.g., a mouse, a trackball, a joystick, a motion sensor, or other pointing instrument), a storage unit 516, a signal generation device 518 (e.g., a speaker), and a network interface device 520, which also are configured to communicate via the bus 508.

The storage unit 516 includes a machine-readable medium 522 on which are stored instructions 524 (e.g., software) embodying any one or more of the methodologies or functions described herein. The instructions 524 (e.g., software) may also reside, completely or at least partially, within the main memory 504 or within the processor 502 (e.g., within a processor's cache memory) during execution thereof by the computer system 500, the main memory 504 and the processor 502 also constituting machine-readable media. The instructions 524 (e.g., software or computer program product) may be transmitted or received over a network 526 via the network interface device 520.

While machine-readable medium 522 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store instructions (e.g., instructions 524). The term “machine-readable medium” shall also be taken to include any medium that is capable of storing instructions (e.g., instructions 524) for execution by the machine and that cause the machine to perform any one or more of the methodologies disclosed herein. The term “machine-readable medium” includes, but is not limited to, data repositories in the form of solid-state memories, optical media, and magnetic media.

Additional Configuration Considerations

An advantage of the configurations as disclosed is that a fault-tolerant pipeline of tasks can be generated dynamically during runtime. Further, dependencies between tasks are tracked such that certain tasks are dispatched for execution only when any tasks that must be executed before those tasks are complete.

Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. Structures and functionality presented as separate components in example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter herein.

Certain embodiments are described herein as including logic or a number of components, modules, or mechanisms, for example, as illustrated in FIGS. 1-3. Modules may constitute either software modules (e.g., code embodied on a machine-readable medium or in a transmission signal) or hardware modules. A hardware module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. In example embodiments, one or more computer systems (e.g., a standalone, client or server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.

In various embodiments, a hardware module may be implemented mechanically or electronically. For example, a hardware module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.

The various operations of example methods described herein may be performed, at least partially, by one or more processors, e.g., processor 502, that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions. The modules referred to herein may, in some example embodiments, comprise processor-implemented modules. For example, the process described in FIGS. 4A-4B may be embodied as software.

The one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., application program interfaces (APIs)).

The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the one or more processors or processor-implemented modules may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the one or more processors or processor-implemented modules may be distributed across a number of geographic locations.

Some portions of this specification are presented in terms of algorithms or symbolic representations of operations on data stored as bits or binary digital signals within a machine memory (e.g., a computer memory). These algorithms or symbolic representations are examples of techniques used by those of ordinary skill in the data processing arts to convey the substance of their work to others skilled in the art. As used herein, an “algorithm” is a self-consistent sequence of operations or similar processing leading to a desired result. In this context, algorithms and operations involve physical manipulation of physical quantities. Typically, but not necessarily, such quantities may take the form of electrical, magnetic, or optical signals capable of being stored, accessed, transferred, combined, compared, or otherwise manipulated by a machine. It is convenient at times, principally for reasons of common usage, to refer to such signals using words such as “data,” “content,” “bits,” “values,” “elements,” “symbols,” “characters,” “terms,” “numbers,” “numerals,” or the like. These words, however, are merely convenient labels and are to be associated with appropriate physical quantities.

Unless specifically stated otherwise, discussions herein using words such as “processing,” “computing,” “calculating,” “determining,” “presenting,” “displaying,” or the like may refer to actions or processes of a machine (e.g., a computer) that manipulates or transforms data represented as physical (e.g., electronic, magnetic, or optical) quantities within one or more memories (e.g., volatile memory, non-volatile memory, or a combination thereof), registers, or other machine components that receive, store, transmit, or display information.

As used herein any reference to “one embodiment” or “an embodiment” means that a particular element, feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.

Some embodiments may be described using the expression “coupled” and “connected” along with their derivatives. For example, some embodiments may be described using the term “coupled” to indicate that two or more elements are in direct physical or electrical contact. The term “coupled,” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other. The embodiments are not limited in this context.

As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).

In addition, the terms “a” and “an” are employed to describe elements and components of the embodiments herein. This is done merely for convenience and to give a general sense of the invention. This description should be read to include one or at least one, and the singular also includes the plural unless it is obvious that it is meant otherwise.

Upon reading this disclosure, those of skill in the art will appreciate still additional alternative structural and functional designs for a system and a process for managing the execution of tasks through the disclosed principles herein. Thus, while particular embodiments and applications have been illustrated and described, it is to be understood that the disclosed embodiments are not limited to the precise construction and components disclosed herein. Various modifications, changes and variations, which will be apparent to those skilled in the art, may be made in the arrangement, operation and details of the method and apparatus disclosed herein without departing from the spirit and scope defined in the appended claims.
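
By way of non-limiting illustration only, the stage-by-stage queuing described above might be sketched as follows. The sketch assumes a single-process, in-memory dispatcher; the names `Dispatcher`, `queue_stage`, and `complete`, and the representation of a task as a `(unique_key, task_type)` tuple, are hypothetical and are not taken from any disclosed embodiment. It shows only one possible arrangement in which a next set of tasks, named when a task completes, is held back until every task in the current stage has completed.

```python
from collections import deque


class Dispatcher:
    """Hypothetical in-memory dispatcher; an illustrative sketch only."""

    def __init__(self):
        self.ready = deque()   # tasks queued for execution, in queuing order
        self.pending = set()   # tasks of the current stage not yet completed
        self.held = []         # next-stage tasks named by completion commands

    def queue_stage(self, tasks):
        """Queue every task of a stage for execution (the 'first command')."""
        for task in tasks:
            self.pending.add(task)
            self.ready.append(task)

    def complete(self, task, next_stage=()):
        """Mark a task complete, optionally naming the next stage
        (the 'second command')."""
        self.pending.discard(task)
        self.held.extend(next_stage)
        # Release the held stage only once the whole current stage is done.
        if not self.pending and self.held:
            stage, self.held = self.held, []
            self.queue_stage(stage)


# Usage: tasks are (unique_key, task_type) tuples, an assumed representation.
dispatcher = Dispatcher()
fetch_a = ("k1", "fetch")
fetch_b = ("k2", "fetch")
index_c = ("k3", "index")

dispatcher.queue_stage([fetch_a, fetch_b])
dispatcher.complete(fetch_a, next_stage=[index_c])  # fetch_b still pending
dispatcher.complete(fetch_b)                        # stage done; index_c queued
```

In this sketch a worker would remove tasks from `ready` with `popleft()`. Persistence of execution state, retries after a configurable time period, and receipt of the commands over a network protocol are deliberately omitted.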

What is claimed is:
 1. A computer-implemented method for managing the execution of tasks, the method comprising: receiving from a module executing in a computer system a first command for queuing tasks for execution, the first command specifying a first stage of a task pipeline comprising a first set of tasks; queuing each task in the first set of tasks for execution; receiving from the module a second command for completing execution of a first task in the first set of tasks, the second command specifying a second stage of the task pipeline comprising a next set of tasks for execution; and queuing the next set of tasks for execution upon completion of the execution of each of the first set of tasks.
 2. The method of claim 1, wherein the first command specifies a unique key and a task type associated with each of the first set of tasks.
 3. The method of claim 2, wherein the second command specifies the unique key and the task type associated with the first task.
 4. The method of claim 1, wherein the first command is a hypertext transfer protocol command, the first set of tasks are specified in a data structure identified by the first command.
 5. The method of claim 1, wherein the first command specifies that the first stage is serialized, and further comprising: receiving from the module a third set of tasks having a same type as the first set of tasks; and queuing the third set of tasks for execution before queuing the next set of tasks for execution.
 6. The method of claim 1, further comprising: receiving a third set of tasks associated with the first stage of the task pipeline; and queuing the third set of tasks for execution upon completion of the execution of each of the first set of tasks and before queuing the next set of tasks for execution.
 7. The method of claim 1, wherein the first stage is divided into a first sub-stage comprising the first task and a second task in the first set of tasks and a second sub-stage comprising a third task in the first set of tasks, the third task being dependent on the first task and independent from the second task, and wherein the step of queuing each task in the first set of tasks further comprises: queuing the first task and the second task for parallel execution on different threads; and queuing the third task for execution upon completion of the execution of the first task without waiting for the completion of the execution of the second task.
 8. The method of claim 1, further comprising: determining after a configurable time period of receiving the second command that the execution of the first task has not completed; and re-queuing the first task for execution.
 9. The method of claim 8, further comprising: determining that after a configurable time period of the re-queuing the execution of the first task has not completed; stopping the execution of the first set of tasks.
 10. The method of claim 1, further comprising periodically storing an execution state of each of the tasks in the first set of tasks and the second set of tasks in a database.
 11. A computer readable medium storing instructions that, when executed by a processor, causes the processor to manage the execution of tasks, the instructions when executed cause the processor to: receive from a module executing in a computer system a first command for queuing tasks for execution, the first command specifying a first stage of a task pipeline comprising a first set of tasks; queue each task in the first set of tasks for execution; receive from the module a second command for completing execution of a first task in the first set of tasks, the second command specifying a second stage of the task pipeline comprising a next set of tasks for execution; and queue the next set of tasks for execution upon completion of the execution of each of the first set of tasks.
 12. The computer readable medium of claim 11, wherein the first command specifies a unique key and a task type associated with each of the first set of tasks.
 13. The computer readable medium of claim 12, wherein the second command specifies the unique key and the task type associated with the first task.
 14. The computer readable medium of claim 11, wherein the first command is a hypertext transfer protocol command, the first set of tasks are specified in a data structure identified by the first command.
 15. The computer readable medium of claim 11, wherein the first command specifies that the first stage is serialized, and the instructions when executed further cause the processor to: receive from the module a third set of tasks having a same type as the first set of tasks; and queue the third set of tasks for execution before queuing the next set of tasks for execution.
 16. The computer readable medium of claim 11, wherein the instructions when executed further cause the processor to: receive a third set of tasks associated with the first stage of the task pipeline; and queue the third set of tasks for execution upon completion of the execution of each of the first set of tasks and before queuing the next set of tasks for execution.
 17. The computer readable medium of claim 11, wherein the first stage is divided into a first sub-stage comprising the first task and a second task in the first set of tasks and a second sub-stage comprising a third task in the first set of tasks, the third task being dependent on the first task and independent from the second task, and wherein the instructions that cause the processor to queue each task in the first set of tasks further comprise instructions that cause the processor to: queue the first task and the second task for parallel execution on different threads; and queue the third task for execution upon completion of the execution of the first task without waiting for the completion of the execution of the second task.
 18. The computer readable medium of claim 11, wherein the instructions when executed further cause the processor to: determine after a configurable time period of receiving the second command that the execution of the first task has not completed; and re-queue the first task for execution.
 19. The computer readable medium of claim 18, wherein the instructions when executed further cause the processor to: determine that after a configurable time period of the re-queuing the execution of the first task has not completed; stop the execution of the first set of tasks.
 20. A computer system, comprising: a software module executing on a processor; and a task execution module executing on a hypertext transfer protocol (HTTP) server, the task execution module configured to: receive from the software module a first HTTP command for queuing tasks for execution, the first command specifying a first stage of a task pipeline comprising a first set of tasks, queue each task in the first set of tasks for execution, receive from the module a second HTTP command for completing execution of a first task in the first set of tasks, the second command specifying a second stage of the task pipeline comprising a next set of tasks for execution, and queue the next set of tasks for execution upon completion of the execution of each of the first set of tasks.